Belief-Dependent Macro-Action Discovery in POMDPs using the Value of Information
This work introduces macro-action discovery using the value of information (VoI) for robust and efficient planning in partially observable Markov decision processes (POMDPs). POMDPs are a powerful framework for planning under uncertainty. Previous approaches have used high-level macro-actions within POMDP policies to reduce planning complexity. However, macro-action design is often heuristic and rarely comes with performance guarantees. Here, we present a method for extracting belief-dependent, variable-length macro-actions directly from a low-level POMDP model. We construct macro-actions by chaining sequences of open-loop actions together when the task-specific VoI, the change in expected task performance caused by observations in the current planning iteration, is low. Importantly, we provide performance guarantees on the resulting VoI macro-action policies in the form of bounded regret relative to the optimal policy. In simulated tracking experiments, we achieve higher reward than both closed-loop and hand-coded macro-action baselines, selectively using VoI macro-actions to reduce planning complexity while maintaining near-optimal task performance.
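The core idea, observations are worth incorporating only when they would change expected task performance, can be sketched in a few lines. This is a minimal illustration, not the paper's algorithm: the belief vector, action-value matrix `Q`, observation likelihoods, and threshold `eps` are all hypothetical toy constructs. The VoI here is the expected best value after a Bayesian belief update on one observation, minus the best value achievable open-loop; actions are chained into a macro-action while this quantity stays below the threshold.

```python
import numpy as np

def best_value(belief, Q):
    # Best expected value when committing to a single action under this belief.
    # Q[a, s] is the (toy) value of action a in state s.
    return float(max(belief @ q for q in Q))

def value_of_information(belief, obs_likelihoods, Q):
    """Toy VoI: expected best value after observing, minus open-loop best value.

    obs_likelihoods is a list of vectors, one per observation,
    giving p(obs | state) for each state. Nonnegative by construction
    (re-choosing the action after the update can only help).
    """
    base = best_value(belief, Q)  # act open-loop, ignoring the observation
    informed = 0.0
    for lik in obs_likelihoods:
        p_obs = float(belief @ lik)          # marginal probability of this obs
        if p_obs > 0:
            posterior = belief * lik / p_obs  # Bayesian belief update
            informed += p_obs * best_value(posterior, Q)
    return informed - base

def build_macro_action(beliefs, obs_likelihoods, Q, actions, eps=1e-3):
    """Chain consecutive open-loop actions while VoI stays below eps."""
    macro = []
    for b, a in zip(beliefs, actions):
        if value_of_information(b, obs_likelihoods, Q) < eps:
            macro.append(a)
        else:
            break  # VoI is high: stop the macro-action and replan closed-loop
    return macro
```

With an uninformative observation model (likelihoods constant across states), the posterior equals the prior and VoI is exactly zero, so actions keep chaining; informative observations yield positive VoI and end the macro-action.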
Approximate Supermodularity Bounds for Experimental Design
Luiz Chamon, Alejandro Ribeiro
This work provides performance guarantees for the greedy solution of experimental design problems. In particular, it focuses on A- and E-optimal designs, for which typical guarantees do not apply since the mean-square error and the maximum eigenvalue of the estimation error covariance matrix are not supermodular. To do so, it leverages the concept of approximate supermodularity to derive non-asymptotic worst-case suboptimality bounds for these greedy solutions. These bounds reveal that as the SNR of the experiments decreases, these cost functions behave increasingly like supermodular functions.
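The greedy procedure the bounds apply to is simple to state: repeatedly add the experiment that most reduces the design cost. As a minimal illustration (not the paper's implementation), the sketch below greedily minimizes a Bayesian A-optimal cost, the trace of the posterior error covariance, where the rows of `X`, the noise level `sigma2`, and the identity prior are all toy assumptions.

```python
import numpy as np

def a_optimal_cost(selected, X, sigma2=1.0):
    """A-optimality cost: trace of the Bayesian error covariance
    (I + (1/sigma2) * sum of x x^T over selected experiments)^-1.
    Lower is better; with no experiments it equals the ambient dimension."""
    d = X.shape[1]
    info = np.eye(d)  # identity prior information (toy assumption)
    for i in selected:
        info += np.outer(X[i], X[i]) / sigma2
    return float(np.trace(np.linalg.inv(info)))

def greedy_design(X, k, sigma2=1.0):
    """Greedily pick k experiments, each time taking the candidate
    that yields the lowest A-optimal cost when added."""
    selected = []
    candidates = set(range(X.shape[0]))
    for _ in range(k):
        best = min(candidates,
                   key=lambda i: a_optimal_cost(selected + [i], X, sigma2))
        selected.append(best)
        candidates.remove(best)
    return selected
```

Because the cost is not supermodular, the classical (1 - 1/e)-style guarantee does not transfer directly; the paper's approximate-supermodularity bounds quantify how far this greedy selection can fall from optimal, with the gap shrinking at low SNR (large `sigma2` in the sketch above).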